Goto

Collaborating Authors

 boundary segment


Complex fractal trainability boundary can arise from trivial non-convexity

Liu, Yizhou

arXiv.org Artificial Intelligence

Training neural networks involves optimizing parameters to minimize a loss function, where the nature of the loss function and the optimization strategy are crucial for effective training. Hyperparameter choices, such as the learning rate in gradient descent (GD), significantly affect the success and speed of convergence. Recent studies indicate that the boundary between bounded and divergent hyperparameters can be fractal, complicating reliable hyperparameter selection. However, the nature of this fractal boundary and methods to avoid it remain unclear. In this study, we focus on GD to investigate the loss landscape properties that might lead to fractal trainability boundaries. We discovered that fractal boundaries can emerge from simple non-convex perturbations, i.e., adding or multiplying cosine type perturbations to quadratic functions. The observed fractal dimensions are influenced by factors like parameter dimension, type of non-convexity, perturbation wavelength, and perturbation amplitude. Our analysis identifies "roughness of perturbation", which measures the gradient's sensitivity to parameter changes, as the factor controlling fractal dimensions of trainability boundaries. We observed a clear transition from non-fractal to fractal trainability boundaries as roughness increases, with the critical roughness causing the perturbed loss function non-convex. Thus, we conclude that fractal trainability boundaries can arise from very simple non-convexity. We anticipate that our findings will enhance the understanding of complex behaviors during neural network training, leading to more consistent and predictable training strategies.


Perceptual Organization Based on Temporal Dynamics

Liu, Xiuwen, Wang, DeLiang L.

Neural Information Processing Systems

A figure-ground segregation network is proposed based on a novel boundary pair representation. Nodes in the network are boundary segments obtained through local grouping. Each node is excitatorily coupled with the neighboring nodes that belong to the same region, and inhibitorily coupled with the corresponding paired node. Gestalt grouping rules are incorporated by modulating connections. The status of a node represents its probability being figural and is updated according to a differential equation.


Perceptual Organization Based on Temporal Dynamics

Liu, Xiuwen, Wang, DeLiang L.

Neural Information Processing Systems

A figure-ground segregation network is proposed based on a novel boundary pair representation. Nodes in the network are boundary segments obtained through local grouping. Each node is excitatorily coupled with the neighboring nodes that belong to the same region, and inhibitorily coupled with the corresponding paired node. Gestalt grouping rules are incorporated by modulating connections. The status of a node represents its probability being figural and is updated according to a differential equation.


Perceptual Organization Based on Temporal Dynamics

Liu, Xiuwen, Wang, DeLiang L.

Neural Information Processing Systems

A figure-ground segregation network is proposed based on a novel boundary pair representation. Nodes in the network are boundary segmentsobtained through local grouping. Each node is excitatorily coupledwith the neighboring nodes that belong to the same region, and inhibitorily coupled with the corresponding paired node. Gestalt grouping rules are incorporated by modulating connections. Thestatus of a node represents its probability being figural and is updated according to a differential equation.